ABSTRACT


This paper presents a linear programming approach to solve simple
linear regression problems with the least absolute value criterion. The
solution technique uses linear programming with an extended minimum
ratio rule. A computational study indicates the efficiency of the
algorithm.


KEY WORDS

Simple Linear Regression Problem
Least Absolute Value Regression Problem
Goal Programming
Linear Programming




1. Introduction

The simple linear regression problem arises from a fundamental model
of statistical analysis. The model consists of an independent (also
known as predictor) random variable which is used to determine the value
of the dependent (or response) random variable. The simple linear
regression fit has been widely used in statistical and economic
forecasting. The simple linear regression problem is to find the linear
equation that best fits the data comprising these two variables.

The simple regression problem with a least absolute value criterion
has the following form:

$$\min_{(\alpha,\,\beta)} \; \sum_{i=1}^{n} \left| y_i - \alpha - x_i \beta \right| \tag{1}$$

where $(x_i, y_i)$, $i = 1, 2, \ldots, n$, are the observed values.
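
As a small illustration of criterion (1), the following Python sketch evaluates the sum of absolute deviations for a candidate line. The function name `lav_objective` and the four data points are assumptions made for this example and do not come from the paper.

```python
def lav_objective(alpha, beta, xs, ys):
    """Sum of absolute deviations of the line y = alpha + beta * x from the data."""
    return sum(abs(y - alpha - beta * x) for x, y in zip(xs, ys))


# Hypothetical data: three collinear points and one outlier.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.0, 2.0, 3.0, 10.0]

# The line y = x passes through the first three points, so only the
# outlier contributes to the objective.
print(lav_objective(0.0, 1.0, xs, ys))
```

For this made-up data the line y = x leaves a single nonzero deviation, the one at the outlier.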


2. Algorithm

Problem (1) is equivalent to the following linear programming problem
(see [4]):

$$
\begin{aligned}
\text{Minimize} \quad & \sum_{i=1}^{n} (P_i + N_i) \\
\text{subject to} \quad & \alpha + x_i \beta + P_i - N_i = y_i, \quad i = 1, 2, \ldots, n \\
& P_i \ge 0 \ \text{and} \ N_i \ge 0, \quad i = 1, 2, \ldots, n
\end{aligned}
\tag{2}
$$

where $P_i$ and $N_i$ are, respectively, the positive and negative deviations
associated with the $i$-th observation.
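
To see why (2) reproduces (1), note that for a fixed line each deviation can be split into its positive and negative parts. The sketch below builds such a feasible point of (2) and evaluates its objective. The helper name `split_deviations` and the data are hypothetical, introduced only for this illustration.

```python
def split_deviations(alpha, beta, xs, ys):
    """Return (P, N) with P[i] - N[i] = y_i - alpha - beta * x_i and P[i] * N[i] = 0."""
    P, N = [], []
    for x, y in zip(xs, ys):
        dev = y - alpha - beta * x
        P.append(max(dev, 0.0))    # positive part of the deviation
        N.append(max(-dev, 0.0))   # negative part of the deviation
    return P, N


# Hypothetical data and an arbitrary candidate line.
xs = [0.0, 1.0, 2.0]
ys = [1.0, 0.0, 4.0]
alpha, beta = 0.5, 1.0

P, N = split_deviations(alpha, beta, xs, ys)

# Every equality constraint of (2) holds: alpha + x_i * beta + P_i - N_i = y_i.
feasible = all(alpha + x * beta + p - q == y for x, y, p, q in zip(xs, ys, P, N))

# The LP objective equals the sum of absolute deviations in (1).
objective = sum(P) + sum(N)
print(P, N, feasible, objective)
```

Because at most one of `P[i]` and `N[i]` is nonzero, minimizing their sum over all feasible points gives the same value as minimizing (1).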

The dual problem of (2) is:

$$
\begin{aligned}
\text{Maximize} \quad & \sum_{i=1}^{n} \pi_i y_i \\
\text{subject to} \quad & \sum_{i=1}^{n} \pi_i = 0 \\
& \sum_{i=1}^{n} \pi_i x_i = 0 \\
& -1 \le \pi_i \le 1, \quad i = 1, 2, \ldots, n
\end{aligned}
\tag{3}
$$
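
A short derivation, added here as a check on (3), shows why every feasible point of (3) bounds the objective of (1) from below. Write $d_i = y_i - \alpha - x_i \beta$ for an arbitrary line and let $\pi$ satisfy the constraints of (3). Then

$$
\sum_{i=1}^{n} \pi_i y_i
= \sum_{i=1}^{n} \pi_i d_i + \alpha \sum_{i=1}^{n} \pi_i + \beta \sum_{i=1}^{n} \pi_i x_i
= \sum_{i=1}^{n} \pi_i d_i
\le \sum_{i=1}^{n} |d_i| .
$$

The two sums multiplying $\alpha$ and $\beta$ vanish by the equality constraints, and the final inequality uses $|\pi_i| \le 1$. Equality requires $\pi_i = \operatorname{sign}(d_i)$ whenever $d_i \ne 0$, which is the complementary slackness condition linking (2) and (3).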

We shall exploit the structure of (3) to solve this problem with a
dual simplex algorithm. This process is identical to solving problem
(2) with the primal simplex algorithm. The following presentation develops
a special purpose algorithm using the revised simplex method on the
primal problem (2) with a multiple pivot strategy. This strategy enables
the method to pivot through several bases in one iteration.

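Initially we choose two observations $(x_c, y_c)$ and $(x_d, y_d)$ such that
$x_c \ne x_d$. Hence, the current basis for the LP problem is:

$$\begin{pmatrix} 1 & x_c \\ 1 & x_d \end{pmatrix}$$

The current right hand side is:

$$\begin{pmatrix} y_c \\ y_d \end{pmatrix}$$

By the adjoint form of the inverse, the initial basis inverse is given
as follows:

$$\frac{1}{x_c - x_d} \begin{pmatrix} -x_d & x_c \\ 1 & -1 \end{pmatrix}$$

The solution of (2) can be calculated:

$$\begin{pmatrix} \alpha \\ \beta \end{pmatrix} = \frac{1}{x_c - x_d} \begin{pmatrix} -x_d y_c + x_c y_d \\ y_c - y_d \end{pmatrix}$$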
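
The initial solution is simply the line through the two chosen observations. A minimal Python sketch of that calculation follows; the function name `line_through` is hypothetical, and `D` is used here as a name for the factor $1/(x_c - x_d)$.

```python
def line_through(xc, yc, xd, yd):
    """Intercept and slope of the line through (xc, yc) and (xd, yd), with xc != xd."""
    if xc == xd:
        raise ValueError("the two observations must have distinct x values")
    D = 1.0 / (xc - xd)                  # reciprocal of the basis determinant
    alpha = (-xd * yc + xc * yd) * D     # intercept
    beta = (yc - yd) * D                 # slope
    return alpha, beta


# Hypothetical observations (1, 3) and (3, 7) lie on y = 1 + 2x.
print(line_through(1.0, 3.0, 3.0, 7.0))
```

Both chosen observations then have zero deviation, which is what makes $\alpha$ and $\beta$ basic in (2).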

Let NB represent the index set of the nonbasic rows. The deviations
for (2), or the reduced costs for (3), are given by:

$$d_i = y_i - \alpha - \beta x_i$$

Define $\pi_i = \operatorname{sign}(d_i)$, $i \in NB$. In the computer code, the value assigned
to $\pi_i$ when $i \in NB$ and $d_i = 0$ is arbitrarily defined to be +1 and,
thereafter, the value is determined by the steps of the algorithm. The
situation where $d_i = 0$ and $i \in NB$ corresponds to the case of degeneracy
in linear programming and can be resolved as described by Charnes [3].
The details of this procedure will not be discussed here.

The nonbasic dual variables are either +1 or -1 depending on the
sign of the corresponding deviation $d_i$.

Since this is a primal algorithm, the optimality condition for (2)
is dual feasibility, namely, $-1 \le \pi_i \le 1$, $i = c, d$.
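
The values of the two basic dual variables follow from the equality constraints of (3) once the nonbasic values are fixed at their bounds. The derivation below is a reconstruction under that assumption, with all sums taken over $i \in NB$:

$$
\begin{aligned}
\pi_c + \pi_d &= -\sum_{i} \pi_i, \\
x_c \pi_c + x_d \pi_d &= -\sum_{i} \pi_i x_i .
\end{aligned}
$$

Subtracting $x_d$ times the first equation from the second gives

$$
\pi_c = \frac{x_d \sum_{i} \pi_i - \sum_{i} \pi_i x_i}{x_c - x_d}
      = \sum_{i} \pi_i \, \frac{x_d - x_i}{x_c - x_d},
\qquad
\pi_d = -\sum_{i} \pi_i - \pi_c .
$$

Each nonbasic observation therefore contributes to $\pi_c$ with a weight that depends only on the $x$ values, so reversing the sign of a single nonbasic $\pi_i$ changes $\pi_c$ by twice the magnitude of that weight.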

If $|\pi_c| > 1$, the basic dual variable $\pi_c$ will leave the basis. If
$|\pi_c| \le 1$ and $|\pi_d| > 1$, the algorithm interchanges the indices $c$ and $d$.
Thus the variable to leave the basis will always be $\pi_c$. Define
$p = \operatorname{sign}(\pi_c)$. The value of $p$ indicates whether $\pi_c$ is to be increased or
decreased: $p = +1$ if $\pi_c$ is to be decreased and $p = -1$ if $\pi_c$ is to be
increased.
The algorithm then determines a nonbasic dual variable to enter the
basis. The procedure is to take the minimum value from a list of ratios:

$$\theta_i = \frac{y_i - \alpha - \beta x_i}{p \lambda_i} \quad \text{for } p\,\pi_i \lambda_i > 0,\ i \in NB, \tag{4}$$

where

$$\lambda_i = \frac{x_d - x_i}{x_c - x_d}, \quad i \in NB.$$

Suppose $\theta_s$ is the smallest ratio value; then $\pi_s$, $s \in NB$, is the
nonbasic dual variable to be examined. First, the algorithm checks whether $\pi_s$
will enter the basis at a dual feasible level by the following criterion:

$$|\pi_c| - 2|\lambda_s| \le 1. \tag{5}$$

If condition (5) is satisfied, $\pi_s$ will enter the basis and will
be dual feasible. The algorithm assigns $c$ the current value of $d$.
Then $d$ is given the value of $s$. For example, if $\pi_c$ and $\pi_d$ are currently
basic, $\pi_c$ is leaving the basis, and the incoming nonbasic variable
is $\pi_s$, the new basic variables will be $\pi_d$ and $\pi_s$, respectively.

On the other hand, if condition (5) is not satisfied, $\pi_s$ will remain
a nonbasic dual variable because, if it were brought into the basis, $\pi_s$
would still be dual infeasible for (2). Rather, it will switch from its
current bound value to its opposite bound value. Moreover, the value of
the basic dual variable $\pi_c$ will be increased (or decreased) by
$2p|\lambda_s|$. The algorithm then eliminates $\theta_s$ from the list of ratios
and examines the next possible candidate to enter the basis from (4).

The algorithm repeats the above procedure until $-1 \le \pi_i \le +1$,
$i = c, d$, is satisfied. It may be noted that, to check this condition after
the first iteration, only $\pi_c$ need be examined.
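
The procedure of this section can be sketched in Python as follows. This is a minimal reconstruction from the description above, not the authors' SIMLP code. The function name `lav_simple`, the tolerance `tol`, the iteration cap `max_iter`, and the choice of the first two usable observations as the starting basis are all assumptions. The variable `lam` is the weight with which a nonbasic dual value enters $\pi_c$, and no anti-cycling rule for degenerate data is included.

```python
def lav_simple(xs, ys, max_iter=1000, tol=1e-12):
    """Least absolute value line fit by pivoting between pairs of observations."""
    n = len(xs)
    c = 0
    d = next(i for i in range(1, n) if xs[i] != xs[c])   # need x_c != x_d

    def fit(c, d):
        D = 1.0 / (xs[c] - xs[d])
        return D, (-xs[d] * ys[c] + xs[c] * ys[d]) * D, (ys[c] - ys[d]) * D

    D, alpha, beta = fit(c, d)
    # Nonbasic duals sit at the bound given by the sign of the deviation
    # (+1 when the deviation is zero); basic duals are stored as 0.
    pi = [1.0 if ys[i] - alpha - beta * xs[i] >= 0 else -1.0 for i in range(n)]
    pi[c] = pi[d] = 0.0

    for _ in range(max_iter):
        D, alpha, beta = fit(c, d)
        t1 = sum(pi)
        t2 = sum(p_i * x for p_i, x in zip(pi, xs))
        pi_c = (xs[d] * t1 - t2) * D          # basic dual value for row c
        pi_d = -t1 - pi_c                     # basic dual value for row d
        if abs(pi_c) <= 1 + tol:
            if abs(pi_d) <= 1 + tol:
                return alpha, beta            # dual feasible, hence optimal
            c, d = d, c                       # make the infeasible dual pi_c
            continue
        p = 1.0 if pi_c > 0 else -1.0

        # Ratio list: only rows whose deviation shrinks toward zero qualify.
        cand = []
        for i in range(n):
            if i in (c, d):
                continue
            lam = (xs[d] - xs[i]) * D
            if p * pi[i] * lam > 0:
                dev = ys[i] - alpha - beta * xs[i]
                cand.append((dev / (p * lam), abs(lam), i))
        cand.sort()

        # Extended ratio rule: pass over candidates that would enter at an
        # infeasible level, flipping each to its opposite bound.
        for theta, abs_lam, s in cand:
            if abs(pi_c) - 2 * abs_lam <= 1 + tol:
                break
            pi[s] = -pi[s]
            pi_c -= 2 * p * abs_lam

        pi[c], pi[s] = p, 0.0                 # c leaves at bound p, s enters
        c, d = d, s
    raise RuntimeError("iteration limit reached (possibly degenerate data)")


# Hypothetical data for illustration.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.5, 0.8, 2.6, 2.9, 9.0]
alpha, beta = lav_simple(xs, ys)
total = sum(abs(y - alpha - beta * x) for x, y in zip(xs, ys))
print(alpha, beta, total)
```

On this made-up data the first iteration passes over one candidate and pivots on the next, which is the multiple pivot behaviour described above. The sketch recomputes the line and the sums from the original data at every iteration rather than updating them.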


3. Steps of the algorithm

In this section, we summarize the algorithm by giving a step-by-step
description. New notation, such as $D$, $T_1$, $T_2$, and $T$, is introduced
to make the algorithm easier to follow.

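1. Initialization:

Choose two observations $(x_c, y_c)$ and $(x_d, y_d)$ such that $x_c \ne x_d$.

Set

$$
\begin{aligned}
D &= \frac{1}{x_c - x_d} \\
\alpha &= (-x_d y_c + x_c y_d) D \\
\beta &= (y_c - y_d) D \\
\pi_i &= \operatorname{sign}(y_i - \alpha - \beta x_i), \quad i \ne c, d \\
\pi_c &= \pi_d = 0 \\
T_1 &= \sum_{i=1}^{n} \pi_i \\
T_2 &= \sum_{i=1}^{n} \pi_i x_i \\
T &= (x_d T_1 - T_2) D
\end{aligned}
$$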

2. If $|T| > 1$, go to step 4. Otherwise, set $D = -D$, interchange
$\pi_c$ and $\pi_d$ by setting $u \leftarrow c$, $c \leftarrow d$, $d \leftarrow u$, and continue.

3. $T = (T_2 - x_d T_1) D$

If $|T| > 1$, go to step 4. Otherwise, stop. The current solution is
optimal.
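
A worked instance of step 1 may help fix the notation. The five observations below are hypothetical, and $D$ is taken to be $1/(x_c - x_d)$, the factor that appears in the solution formula of Section 2. Take $(x_i, y_i) = (0, 0.5), (1, 0.8), (2, 2.6), (3, 2.9), (4, 9.0)$ with $c = 1$ and $d = 2$:

$$
\begin{aligned}
D &= \frac{1}{0 - 1} = -1, \\
\alpha &= (-1 \cdot 0.5 + 0 \cdot 0.8)(-1) = 0.5, \qquad
\beta = (0.5 - 0.8)(-1) = 0.3, \\
d_3 &= 2.6 - 0.5 - 0.6 = 1.5, \quad
d_4 = 2.9 - 0.5 - 0.9 = 1.5, \quad
d_5 = 9.0 - 0.5 - 1.2 = 7.3, \\
\pi_3 &= \pi_4 = \pi_5 = +1, \qquad \pi_1 = \pi_2 = 0, \\
T_1 &= 3, \qquad T_2 = 2 + 3 + 4 = 9, \\
T &= (1 \cdot 3 - 9)(-1) = 6 .
\end{aligned}
$$

Since $|T| > 1$, step 2 sends the algorithm to step 4: the line through the first two observations is not optimal for this data.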
designed expressly to provide least absolute value estimates for a simple
linear regression model. The University of Texas CDC 6600 was used in
this study. The observations have been drawn from various uniform and
normal distributions using a random number generator. The results of the
study are summarized in Table 1.

Our study has indicated that this specialization of the linear
programming approach first developed by Barrodale and Roberts [2] is
uniformly faster than Sadovski's approach on all problem sizes. In
problems with more than 300 observations, SIMLP is approximately 50
times faster than LONESL. SIMLP requires considerably less storage than
the Barrodale and Roberts code and approximately the same amount as
Sadovski's code. Also, Sposito [6] has shown that LONESL may not always
converge, while SIMLP utilizes the convergence properties of linear
programming theory. Another feature of SIMLP is that no cumulative
roundoff error is present, since all necessary values are recalculated
from the original data at each iteration.
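
The timing claim can be checked directly against Table 1. The short Python sketch below recomputes the ratio of LONESL to SIMLP CPU time for each problem size. The variable names are chosen here for illustration; the figures are those reported in Table 1.

```python
# (number of observations, SIMLP CPU ms, LONESL CPU ms), copied from Table 1
table1 = [
    (50, 7, 57), (100, 32, 407), (150, 46, 953), (200, 102, 1647),
    (250, 84, 3230), (300, 70, 8373), (350, 262, 6367),
    (400, 265, 16296), (450, 212, 8341), (500, 184, 15374),
]

# Ratio of LONESL time to SIMLP time for each problem size.
speedup = {n: lonesl / simlp for n, simlp, lonesl in table1}

# Average ratio over the problems with more than 300 observations.
large = [ratio for n, ratio in speedup.items() if n > 300]
mean_large = sum(large) / len(large)

for n, ratio in speedup.items():
    print(f"n = {n:3d}: LONESL/SIMLP = {ratio:6.1f}")
print(f"mean ratio for n > 300: {mean_large:.1f}")
```

On these figures the ratio for problems with more than 300 observations ranges from roughly 24 to 84 and averages about 52, which is consistent with the stated factor of approximately 50.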

5. Conclusion

This paper presents a special purpose algorithm to solve simple
least absolute value regression problems. The approach utilizes the
characteristics and convergence properties of linear programming. With
the addition of the multiple pivot strategy in linear programming, the
simple least absolute value regression problem is solved efficiently. The
computational results show that the code presented here is superior to the
published code of Sadovski in terms of solution time. A listing of the
computer code will be found in [1].
Table 1 (A Comparison between SIMLP and LONESL)

Number of      SIMLP                          LONESL
observations   Time (CPU       Number of      Time (CPU       Number of
               milliseconds)   iterations     milliseconds)   iterations

     50               7             3               57             2
    100              32             6              407             4
    150              46             5              953             4
    200             102             5             1647             4
    250              84             2             3230             5
    300              70             3             8373             9
    350             262             3             6367             5
    400             265             3            16296            10
    450             212             5             8341             4
    500             184             3            15374             6